Qwen 3.7 Plus is now available on Serverless, exclusively on Fireworks. Try it today.

Model Library
/Deepseek/DeepSeek R1 (Fast)
Deepseek Logo Mark

DeepSeek R1 (Fast)

Ready
model path:accounts/fireworks/models/deepseek-r1

DeepSeek R1 (Fast) is the speed-optimized serverless deployment of DeepSeek-R1. Compared to the DeepSeek R1 (Basic) endpoint, R1 (Fast) provides faster speeds with higher per-token prices, see https://fireworks.ai/pricing for details. Identical models are served on the two endpoints, so there are no quality or quantization differences. DeepSeek-R1 is a state-of-the-art large language model optimized with reinforcement learning and cold-start data for exceptional reasoning, math, and code performance. The model is identical to the one uploaded by DeepSeek on HuggingFace. Note that fine-tuning for this model is only available through contacting fireworks at https://fireworks.ai/company/contact-us.

DeepSeek R1 (Fast) API Features

Fine-tuning

Docs

DeepSeek R1 (Fast) can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model

On-demand Deployment

Docs

On-demand deployments allow you to use DeepSeek R1 (Fast) on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

DeepSeek R1 FAQs

What is DeepSeek R1 (Fast) and who developed it?

DeepSeek R1 is a serverless, speed-optimized deployment of DeepSeek-R1 hosted by Fireworks AI. It uses the same model as DeepSeek R1 (Basic), with faster inference and higher per-token costs. The underlying model, DeepSeek-R1, was developed by DeepSeek and is optimized for advanced reasoning, math, and code generation using a reinforcement learning-first training approach.

What applications and use cases does DeepSeek R1 excel at?

DeepSeek R1 excels at:

  • Multi-step reasoning and logical inference
  • Mathematical problem-solving (e.g., 97.3% on MATH-500)
  • Advanced code generation (2,029 Elo on Codeforces-like tasks)
  • Scientific question answering
  • High-stakes decision-making workflows.
What is the maximum context length for DeepSeek R1?

The maximum context length is 163,840 tokens.

Does DeepSeek R1 support quantized formats (4-bit/8-bit)?

Yes. DeepSeek R1 has multiple quantized variants including 4-bit and 8-bit options.

What is the default temperature of DeepSeek R1 on Fireworks AI?

The recommended default sampling temperature for DeepSeek R1 is 0.6, as used in benchmark evaluations.

What is the maximum output length for DeepSeek R1?

The maximum generation length is 32,768 tokens.

What are known failure modes of DeepSeek R1?

Known issues include:

  • Repetitive or incoherent output if temperature is too low or system prompt is misused
  • The model may skip chain-of-thought reasoning unless prompted with <think>, which can reduce performance on reasoning tasks.
How many parameters does DeepSeek R1 have?
  • Total parameters: 671 billion
  • Activated per forward pass: 37 billion

DeepSeek R1 uses a Mixture of Experts (MoE) architecture to reduce active compute while maintaining model capacity.

Is fine-tuning supported for DeepSeek R1?

Yes. Fireworks supports fine-tuning DeepSeek R1 using LoRA-based adapters. Contact Fireworks for access.

What license governs commercial use of DeepSeek R1?

DeepSeek R1 is licensed under the MIT License, which permits commercial use, modification, and redistribution.

Metadata

State
Ready
Created on
1/20/2025
Kind
Base model
Provider
Deepseek

Specification

Calibrated
Yes
Mixture-of-Experts
Yes
Parameters
671B

Supported Functionality

Fine-tuning
Supported
Serverless
Not supported
Context Length
163k tokens
Function Calling
Not supported
Embeddings
Not supported
Rerankers
Not supported
Support image input
Not supported